Towards Interpretable Natural Language Understanding with Explanations as Latent Variables (Wangchunshu Zhou et al.)

Neural Information Processing Systems

Generating natural language explanations has recently shown very promising results, not only offering interpretable explanations but also providing additional information and supervision for prediction. However, existing approaches usually require a large set of human-annotated explanations for training, and collecting such a set is both time-consuming and expensive. In this paper, we develop a general framework for interpretable natural language understanding that requires only a small set of human-annotated explanations for training. Our framework treats natural language explanations as latent variables that model the underlying reasoning process of a neural model. We develop a variational EM framework for optimization in which an explanation generation module and an explanation-augmented prediction module are alternately optimized and mutually enhance each other. Moreover, we propose an explanation-based self-training method under this framework for semi-supervised learning. It alternates between assigning pseudo-labels to unlabeled data and generating new explanations, iteratively improving each other.
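
The alternating optimization described above can be sketched as a short training loop. The following is a minimal schematic under assumed interfaces, not the authors' code: `generator` stands in for the explanation generation module p(e | x, y), `predictor` for the explanation-augmented prediction module p(y | x, e), and the `generate`/`fit`/`score` methods and `label_set` argument are illustrative names.

```python
# Schematic variational-EM loop with explanation-based self-training.
# All module interfaces here are assumed for illustration.

def em_train(labeled, unlabeled, label_set, generator, predictor, rounds=5):
    for _ in range(rounds):
        # E-step: generate an explanation for each labeled example,
        # conditioning on the gold label.
        explanations = [generator.generate(x, y) for x, y in labeled]

        # M-step: refit the explanation-augmented predictor on
        # (input, explanation) -> label triples.
        predictor.fit(
            [(x, e) for (x, _), e in zip(labeled, explanations)],
            [y for _, y in labeled],
        )

        # Explanation-based self-training: pseudo-label each unlabeled
        # input by scoring every candidate label together with an
        # explanation generated for that candidate, keeping the best.
        pseudo = [
            (x, max(label_set,
                    key=lambda y: predictor.score(
                        x, generator.generate(x, y), y)))
            for x in unlabeled
        ]

        # Refit the generator on gold plus pseudo-labeled pairs so the
        # next round produces better explanations.
        generator.fit(labeled + pseudo)
    return generator, predictor
```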


DBR: Divergence-Based Regularization for Debiasing Natural Language Understanding Models

arXiv.org Artificial Intelligence

Pre-trained language models (PLMs) have achieved impressive results on various natural language processing tasks. However, recent research has revealed that these models often rely on superficial features and shortcuts instead of developing a genuine understanding of language, especially for natural language understanding (NLU) tasks. Consequently, the models struggle to generalize to out-of-domain data. In this work, we propose Divergence-Based Regularization (DBR) to mitigate this shortcut learning behavior. Our method measures the divergence between the output distributions for original examples and for examples in which shortcut tokens have been masked. This prevents the model's predictions from being overly influenced by shortcut features or biases. We evaluate our model on three NLU tasks and find that it improves out-of-domain performance with little loss of in-domain accuracy. Our results demonstrate that reducing reliance on shortcuts and superficial features can enhance the generalization ability of large pre-trained language models.
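
The regularizer itself is compact enough to sketch. The PyTorch snippet below is an illustration of the idea rather than the paper's exact formulation: `model` is assumed to map token IDs to class logits, `masked_input_ids` is assumed to be the same batch with suspected shortcut tokens replaced by the mask token, and the KL direction and weight `lam` are assumptions.

```python
import torch
import torch.nn.functional as F

def dbr_loss(model, input_ids, masked_input_ids, labels, lam=1.0):
    logits = model(input_ids)                # predictions on original input
    masked_logits = model(masked_input_ids)  # predictions with shortcuts masked

    # Standard task loss on the original examples.
    task_loss = F.cross_entropy(logits, labels)

    # Divergence between the two output distributions: if predictions shift
    # sharply once shortcut tokens are masked, the model was leaning on
    # those tokens, and this term penalizes it.
    divergence = F.kl_div(
        F.log_softmax(masked_logits, dim=-1),
        F.softmax(logits, dim=-1),
        reduction="batchmean",
    )
    return task_loss + lam * divergence
```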


Unified Language Model Pre-training for Natural Language Understanding and Generation

Neural Information Processing Systems

This paper presents a new unified pre-trained language model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks. The model is pre-trained using three types of language modeling tasks: unidirectional, bidirectional, and sequence-to-sequence prediction. The unified modeling is achieved by employing a shared Transformer network and utilizing specific self-attention masks to control what context the prediction conditions on.
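
The three mask patterns are easy to show concretely. In the sketch below, `mask[i, j] = True` means position i may attend to position j; the function names and the source/target segment layout are illustrative, not UniLM's actual implementation (which additionally handles special tokens, segments, and padding).

```python
import torch

def bidirectional_mask(n):
    # Every token attends to every token (BERT-style encoding).
    return torch.ones(n, n, dtype=torch.bool)

def unidirectional_mask(n):
    # Each token attends only to itself and tokens to its left (GPT-style LM).
    return torch.tril(torch.ones(n, n)).bool()

def seq2seq_mask(n_src, n_tgt):
    # Source tokens attend bidirectionally within the source; target tokens
    # attend to the full source plus the already-generated target prefix.
    n = n_src + n_tgt
    mask = torch.zeros(n, n, dtype=torch.bool)
    mask[:n_src, :n_src] = True                       # src <-> src
    mask[n_src:, :n_src] = True                       # tgt -> src
    mask[n_src:, n_src:] = torch.tril(torch.ones(n_tgt, n_tgt)).bool()
    return mask
```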


Reviews: Unified Language Model Pre-training for Natural Language Understanding and Generation

Neural Information Processing Systems

This paper presents an alternative training regime for the BERT contextual embedding model that incorporates additional conditioning contexts such as left-to-right language modelling and sequence transduction. The reviewers agree that the work is well motivated and is a reasonable attempt to address some of the issues with the original BERT model. The results are suitably strong, and as such this paper is likely to be of interest to those working on contextual embedding models, although it is puzzling that a classic language modelling perplexity evaluation was not included, given that this is one of the objectives the model optimises. The authors' final paper should incorporate the answers to the questions raised by the reviewers.



Review for NeurIPS paper: Towards Interpretable Natural Language Understanding with Explanations as Latent Variables

Neural Information Processing Systems

Weaknesses: My main concern is about how explanations are being employed as latent variables. I had assumed, based on the introduction, that the final predictor would factor through the final explanation. This would provide the faithfulness guarantee that two inputs which produce the same explanation would produce the same output label. However, it seems that during training the explanation is conditioned on the gold label. The paper points out on L161 that "generating explanations without a predicted label often results in irrelevant and even misleading explanations."


Review for NeurIPS paper: Towards Interpretable Natural Language Understanding with Explanations as Latent Variables

Neural Information Processing Systems

This paper proposes an EM framework for explainable language processing. Strength: the idea is new and neat. Weakness: the experiments and presentation can be further improved. The authors are encouraged to further improve the quality of the paper based on the reviewers' comments. NOTE FROM PROGRAM CHAIRS: For the camera-ready version, please expand your broader impact statement to include a more substantive discussion of the potential negative impacts of your work, as well as mitigations.


An Integrated Platform for Studying Learning with Intelligent Tutoring Systems: CTAT+TutorShop

arXiv.org Artificial Intelligence

Intelligent tutoring systems (ITSs) are effective in helping students learn, and further research could make them even more effective. Particularly desirable is research into how students learn with these systems, how these systems best support student learning, and which learning-sciences principles are key in ITSs. CTAT+TutorShop provides a full-stack integrated platform that facilitates a complete research lifecycle with ITSs, including using ITS data to discover learner challenges, to identify opportunities for system improvements, and to conduct experimental studies. The platform includes authoring tools that support and accelerate the development of ITSs and provide automatic data logging in a format compatible with DataShop, an independent site that supports the analysis of ed-tech log data to study student learning. Among the many technology platforms that exist to support learning-sciences research, CTAT+TutorShop may be the only one that offers researchers the possibility of authoring elements of ITSs, or whole ITSs, as part of designing studies. The platform has been used to develop and conduct an estimated 147 research studies, which have run in a wide variety of laboratory and real-world educational settings, including K-12 and higher education, and have addressed a wide range of research questions. This paper presents five case studies of research conducted on the CTAT+TutorShop platform and summarizes what has been accomplished and what is possible for future researchers. We reflect on the distinctive elements of this platform that have made it so effective in facilitating a wide range of ITS research.


Evaluating the Design Features of an Intelligent Tutoring System for Advanced Mathematics Learning

arXiv.org Artificial Intelligence

Xiaomai is an intelligent tutoring system (ITS) designed to help Chinese college students learn advanced mathematics and prepare for the graduate school math entrance exam. This study investigates two distinctive features within Xiaomai: the incorporation of free-response questions with automatic feedback and the metacognitive element of reflecting on self-made errors. An experiment was conducted to evaluate the impact of these features on mathematics learning. One hundred and twenty college students were recruited and randomly assigned to four conditions: (1) multiple-choice questions without reflection, (2) multiple-choice questions with reflection, (3) free-response questions without reflection, and (4) free-response questions with reflection. Students in the multiple-choice conditions demonstrated better practice performance and learning outcomes than their counterparts in the free-response conditions. Additionally, the incorporation of error reflection did not yield a significant impact on students' practice performance or learning outcomes. These findings indicate that the current design of the free-response questions and the metacognitive feature of error reflection do not enhance the efficacy of the math ITS. This study highlights the need to redesign or enhance Xiaomai to optimize its effectiveness in facilitating advanced mathematics learning.